Arar
Orbis: Overcoming Challenges of Long-Horizon Prediction in Driving World Models
Mousakhan, Arian, Mittal, Sudhanshu, Galesso, Silvio, Farid, Karim, Brox, Thomas
Existing world models for autonomous driving struggle with long-horizon generation and generalization to challenging scenarios. In this work, we develop a model using simple design choices, and without additional supervision or sensors, such as maps, depth, or multiple cameras. We show that our model yields state-of-the-art performance, despite having only 469M parameters and being trained on 280h of video data. It particularly stands out in difficult scenarios like turning maneuvers and urban traffic. We test whether discrete token models possibly have advantages over continuous models based on flow matching. To this end, we set up a hybrid tokenizer that is compatible with both approaches and allows for a side-by-side comparison. Our study concludes in favor of the continuous autoregressive model, which is less brittle on individual design choices and more powerful than the model built on discrete tokens. Code, models and qualitative results are publicly available at https://lmb-freiburg.github.io/orbis.github.io/.
- Europe > Germany > Baden-Württemberg > Freiburg (0.24)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Europe > Austria > Vienna (0.04)
- Asia > Middle East > Saudi Arabia > Northern Borders Province > Arar (0.04)
- Information Technology (0.66)
- Automobiles & Trucks (0.48)
- Transportation > Ground > Road (0.35)
Scalable Policy Evaluation with Video World Models
Tseng, Wei-Cheng, Gu, Jinwei, Zhang, Qinsheng, Mao, Hanzi, Liu, Ming-Yu, Shkurti, Florian, Yen-Chen, Lin
Training generalist policies for robotic manipulation has shown great promise, as they enable language-conditioned, multi-task behaviors across diverse scenarios. However, evaluating these policies remains difficult because real-world testing is expensive, time-consuming, and labor-intensive. It also requires frequent environment resets and carries safety risks when deploying unproven policies on physical robots. Manually creating and populating simulation environments with assets for robotic manipulation has not addressed these issues, primarily due to the significant engineering effort required and the substantial sim-to-real gap, both in terms of physics and rendering. In this paper, we explore the use of action-conditional video generation models as a scalable way to learn world models for policy evaluation. We demonstrate how to incorporate action conditioning into existing pre-trained video generation models. This allows leveraging internet-scale in-the-wild online videos during the pre-training stage and alleviates the need for a large dataset of paired video-action data, which is expensive to collect for robotic manipulation. Our paper examines the effect of dataset diversity, pre-trained weights, and common failure cases for the proposed evaluation pipeline. Our experiments demonstrate that across various metrics, including policy ranking and the correlation between actual policy values and predicted policy values, these models offer a promising approach for evaluating policies without requiring real-world interactions.
- North America > Canada > Ontario > Toronto (0.14)
- Asia > Middle East > Saudi Arabia > Northern Borders Province > Arar (0.04)
- Europe > Netherlands > South Holland > Delft (0.04)
- Information Technology > Artificial Intelligence > Robots (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
- Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.62)
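The abstract of "Scalable Policy Evaluation with Video World Models" names two kinds of metrics: policy ranking, and the correlation between actual and predicted policy values. The following is only a minimal sketch of how such metrics are commonly computed, not the authors' evaluation code. The function names and the example values are hypothetical, and the rank helper assumes there are no tied values.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def ranks(values):
    """Rank of each value (0 = smallest). Assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order):
        result[index] = rank
    return result


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical success rates for four policies (illustrative only):
# measured on a real robot vs. estimated inside a learned world model.
real_values = [0.2, 0.5, 0.8, 0.9]
predicted_values = [0.1, 0.4, 0.6, 0.95]

value_correlation = pearson(real_values, predicted_values)
ranking_agreement = spearman(real_values, predicted_values)
print(f"value correlation: {value_correlation:.3f}")
print(f"ranking agreement: {ranking_agreement:.3f}")
```

A high rank correlation means the world model orders the policies the same way real-world testing does, even when its absolute value estimates are off.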
SIMA 2: A Generalist Embodied Agent for Virtual Worlds
SIMA team, Bolton, Adrian, Lerchner, Alexander, Cordell, Alexandra, Moufarek, Alexandre, Bolt, Andrew, Lampinen, Andrew, Mitenkova, Anna, Hallingstad, Arne Olav, Vujatovic, Bojan, Li, Bonnie, Lu, Cong, Wierstra, Daan, Sawyer, Daniel P., Slater, Daniel, Reichert, David, Vercelli, Davide, Hassabis, Demis, Hudson, Drew A., Williams, Duncan, Hirst, Ed, Pardo, Fabio, Hill, Felix, Besse, Frederic, Openshaw, Hannah, Chan, Harris, Soyer, Hubert, Wang, Jane X., Clune, Jeff, Agapiou, John, Reid, John, Marino, Joseph, Kim, Junkyung, Gregor, Karol, Sridhar, Kaustubh, McKinney, Kay, Kampis, Laura, Zhang, Lei M., Matthey, Loic, Wang, Luyu, Raad, Maria Abi, Loks-Thompson, Maria, Engelcke, Martin, Kecman, Matija, Jackson, Matthew, Gazeau, Maxime, Purkiss, Ollie, Knagg, Oscar, Stys, Peter, Mendolicchio, Piermaria, Hadsell, Raia, Ke, Rosemary, Faulkner, Ryan, Chakera, Sarah, Baveja, Satinder Singh, Legg, Shane, Kashem, Sheleem, Terzi, Tayfun, Keck, Thomas, Harley, Tim, Scholtes, Tim, Roberts, Tyson, Mnih, Volodymyr, Liu, Yulan, Wang, Zhengdong, Ghahramani, Zoubin
We introduce SIMA 2, a generalist embodied agent that understands and acts in a wide variety of 3D virtual worlds. Built upon a Gemini foundation model, SIMA 2 represents a significant step toward active, goal-directed interaction within an embodied environment. Unlike prior work (e.g., SIMA 1) limited to simple language commands, SIMA 2 acts as an interactive partner, capable of reasoning about high-level goals, conversing with the user, and handling complex instructions given through language and images. Across a diverse portfolio of games, SIMA 2 substantially closes the gap with human performance and demonstrates robust generalization to previously unseen environments, all while retaining the base model's core reasoning capabilities. Furthermore, we demonstrate a capacity for open-ended self-improvement: by leveraging Gemini to generate tasks and provide rewards, SIMA 2 can autonomously learn new skills from scratch in a new environment. This work validates a path toward creating versatile and continuously learning agents for both virtual and, eventually, physical worlds.
- Europe > Sweden > Skåne County > Malmö (0.04)
- Asia > Middle East > Saudi Arabia > Northern Borders Province > Arar (0.04)